Search Result

Journals

Publication Years

Keywords

Please wait a minute...

For Selected:

Download Citations
EndNote Ris BibTeX

Toggle Thumbnails

Select

Online task scheduling algorithm for big data analytics based on cumulative running work

LI Yefei, XU Chao, XU Daoqiang, ZOU Yunfeng, ZHANG Xiaoda, QIAN Zhuzhong

Journal of Computer Applications 2019, 39 (8): 2431-2437. DOI: 10.11772/j.issn.1001-9081.2019010073

Abstract （390）

PDF （1056KB）（248）

Save

A Cumulative Running Work (CRW) based task scheduler CRWScheduler was proposed to effectively process tasks without any prior knowledge for big data analytics platform like Hadoop and Spark. The running job was moved from a low-weight queue to a high-weight one based on CRW. When resources were allocated to a job, both the queue of the job and the instantaneous resource utilization of the job were considered, significantly improving the overall system performance without prior knowledge. The prototype of CRWScheduler was implemented based on Apache Hadoop YARN. Experimental results on 28-node benchmark testing cluster show that CRWScheduler reduces average Job Flow Time (JFT) by 21% and decreases JFT of 95th percentile by up to 35% compared with YARN fair scheduler. Further improvements can be obtained when CRWScheduler cooperates with task-level schedulers.

Reference | Related Articles | Metrics